[ISSUE-153] Add blocking poll into python bindings#154
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a poll() method to the Python LogScanner bindings, enabling incremental, timeout-based record reading with a configurable timeout parameter.
Changes:
- Added
LogScanner.poll(timeout_ms)method with input validation and timeout handling - Implemented
create_empty_table()helper method to return properly-schemed empty tables on timeout - Added example usage demonstrating the new poll-based consumption pattern
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| bindings/python/src/table.rs | Implements the new poll() method with validation, error handling, and empty table creation helper |
| bindings/python/example/example.py | Adds example demonstrating usage of the new poll() method |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
luoyuxia
left a comment
There was a problem hiding this comment.
@fresh-borzoni Thanks for the pr. Only one question. PTAL
|
|
||
| let timeout = Duration::from_millis(timeout_ms as u64); | ||
| let scan_records = py | ||
| .detach(|| TOKIO_RUNTIME.block_on(async { self.inner.poll(timeout).await })) |
There was a problem hiding this comment.
actually, I'm thinking can we use scanner#poll_batches directly
There was a problem hiding this comment.
makes sense, thank you!
|
i need #211 to be merged, so that I can use offsets in |
55f7acb to
4cb28b5
Compare
|
@luoyuxia thanks for the review. Addressed comment. I updated to use new poll_batches everywhere, since it's more efficient and we avoid conversions when we don't need them. Also added missing slicing to to_arrow PTAL 🙏 |
4cb28b5 to
0333dd5
Compare
luoyuxia
left a comment
There was a problem hiding this comment.
@fresh-borzoni Thanks for update. Only one minor comment
|
@luoyuxia Thank you for the review |
luoyuxia
left a comment
There was a problem hiding this comment.
@fresh-borzoni Thanks for update. LGTM!
Purpose
Linked issue: close #153
Add a
poll()method to Python LogScanner for incremental, timeout-based record reading.Brief change log
LogScanner.poll(timeout_ms)method to Python bindingsAPI and Format
LogScanner.poll(timeout_ms: int) -> pyarrow.Table